Reproducible experiments on Three-Dimensional Entity Resolution with JedAI
Abstract
In Papadakis et al. (2020), we presented the latest release of JedAI, an open-source Entity Resolution (ER) system that allows for building a large variety of end-to-end ER pipelines. Through a thorough experimental evaluation, we compared a schema-agnostic pipeline based on blocks with a schema-based pipeline based on similarity joins. We applied them to 10 established, real-world datasets and assessed them with respect to effectiveness and time efficiency. Special care was taken to juxtapose their scalability, too, using seven synthetic datasets. Moreover, we experimentally compared the batch pipeline with its progressive counterpart. In this companion paper, we describe how to reproduce the entire study that pertains to JedAI’s serial execution through its intuitive user interface. We also explain how to examine the robustness of the parameter configurations we have selected.
Similar resources
JedAI: The Force Behind Entity Resolution
We present JedAI, a toolkit for Entity Resolution that can be used in three different ways: as an open-source Java library that implements numerous state-of-the-art, domain-independent methods, as a workbench that facilitates the evaluation of their relative performance and as a desktop application that offers out-of-the-box ER solutions. JedAI bridges the gap between the database and the Seman...
ENCORE: Experiments with a Synthetic Entity Co-reference Resolution Tool
We present ENCORE, a system for entity co-reference resolution that synthesizes the outputs of several off-the-shelf co-reference resolution systems. To boost precision, we filter the output using a named entity recognition tool called SYNERGY which itself is a synthesis of several off-the-shelf NER systems. ENCORE is designed to work under two conditions: NP-CR which resolves noun phrase co-re...
MakeSense: Managing Reproducible WSNs Experiments
Wireless Sensor Networks (WSN) users often use simulation campaigns before real deployment to evaluate performance and to finetune application and network parameters. This process requires repeating the same experiments under similar conditions and to collect, parse and present data efficiently. This paper introduces MakeSense: a tool that automates this workflow and that allows reproducing sim...
Calibrating MoonGen for Reproducible Experiments
which are needed for high-speed operation mode, require purchasing a license. In netmap user space applications do not have direct access to the NIC’s registers. This is a safety precaution as a misconfigured NIC can crash the whole system by corrupting memory [20]. This restriction in netmap is critical as it is designed to be included in an operating system: netmap is already part of the Free...
Computational experiments on three-dimensional molecular diffusion in porous media
Estimations of apparent diffusion coefficients usually consist of curve-fitting the output of 1-D models to experimental laboratory-measured data from porous aggregates shaped in different forms. In this research, a computational exploration is presented on the alternative use of three-dimensional models for the same purpose. The outputs of the 3-D models were compared to results generated by o...
Journal
Journal title: Information Systems
Year: 2021
ISSN: 0306-4379, 1873-6076
DOI: https://doi.org/10.1016/j.is.2021.101830